Insight into Mantle Cell Lymphoma Pathobiology, Diagnosis, and Treatment Using Network-Based and Drug-Repurposing Approaches

Mantle cell lymphoma (MCL) is a rare, incurable, and aggressive B-cell non-Hodgkin lymphoma (NHL). Early MCL diagnosis and treatment is critical and puzzling due to inter/intra-tumoral heterogeneity and limited understanding of the underlying molecular mechanisms. We developed and applied a multifaceted analysis of selected publicly available transcriptomic data of well-defined MCL stages, integrating network-based methods for pathway enrichment analysis, co-expression module alignment, drug repurposing, and prediction of effective drug combinations. We demonstrate the “butterfly effect” emerging from a small set of initially differentially expressed genes, rapidly expanding into numerous deregulated cellular processes, signaling pathways, and core machineries as MCL becomes aggressive. We explore pathogenicity-related signaling circuits by detecting common co-expression modules in MCL stages, pointing out, among others, the role of VEGFA and SPARC proteins in MCL progression and recommend further study of precise drug combinations. Our findings highlight the benefit that can be leveraged by such an approach for better understanding pathobiology and identifying high-priority novel diagnostic and prognostic biomarkers, drug targets, and efficacious combination therapies against MCL that should be further validated for their clinical impact.

The majority of MCL cases (>95%) feature the abnormal chromosomal translocation t(11;14)(q13;q32)(IgH/CCND1), leading to constitutive cyclin D1 (CCND1) overexpression inducing cell-cycle deregulation, along with deregulation of DNA damage and repair mechanisms and transcriptional alterations [3][4][5].However, overexpressed CCND1 is not always pathognomonic for MCL initiation, development, and progression (CCND2, CCND3, CCNE translocations may alternatively deregulate cell cycle progression) [4], and comprehensive profiling at the genome, epigenome, transcriptome, and proteome level is often necessitated, although still far from clinical practice [3,6,7].The frequently occurring relapses after initial treatment with increasing resistance to chemotherapy make MCL the B-cell lymphoma with the worst clinical outcome, [3,8,9] while only a very small percentage of cases show an indolent clinical course [9].Therefore, it is critical to define the various molecular subgroups of MCL to investigate putative risk-adjusted therapy strategies.
Presently, high-throughput transcriptomic and proteomic analyses have been instrumental in revealing constitutively activated and/or deregulated proteins and pathways in MCL [10][11][12], as well as potential prognostic factors [13][14][15].In this context, the advent of drug repurposing (DR) as a worthwhile therapeutic strategy delivering "second chances" for the reappraisal of specific disease-oriented and molecular-targeted drugs, opens avenues of enhanced opportunities for MCL patients towards the systematic improvement of health outcomes and health-care costs [16,17].However, despite best efforts, intuitive computational approaches that piece together the jigsaw puzzle-like information contained in such publicly available big data are still lacking.Our approach leverages a promising integrative bioinformatic pipeline aimed at understanding MCL pathobiology through network-based methods and drug-repurposing strategies to uncover drug targets and propose drugs that might be successfully repurposed against MCL.

Results
Publicly available comparative transcriptomic data on mRNA relative abundance in tissue and blood samples of MCL patients and normal individuals were carefully selected and reanalyzed following a modular pipeline of network-based methods (Figure 1).Five microarray studies were revisited, as described in the Section 4. In the following paragraphs, the findings of this analysis are described in detail.

Multidimensional Scaling Analysis and Hierarchical Clustering Uncovers Distinctive Profiles in MCL vs. Healthy Donors and Increased Heterogeneity in MCL Stages
Multidimensional scaling (MDS) analysis of the 500 most variable transcripts shows that normal (benign lymphadenitis, BL) and the in situ MCL stage (InS) are well separated considering the leading dimensions of the analysis (Supplementary Figure S1).On the contrary, the classical stage (cMCL) seems to overlap with the aggressive and intermediate stages (Supplementary Figure S1; intMCL, agrMCL).RNA extracted from the peripheral blood (PB) B cells of five typical MCL patients and eight healthy individuals (MCL1-n, H1-n samples; Supplementary Figure S2) suggests that MCL transcriptional profiles are more disparate compared to the healthy ones, implying lymphoma heterogeneity.The dispersion of MCL cases is even more pronounced when more MCL cases are included in the MDS analysis (Supplementary Figures S3B and S2).Newly diagnosed MCL patients exhibit higher similarity based on the five analyzed cases (samples in cyan, Supplementary Figure S3B).
Hierarchical clustering heatmaps of the 200 most variable genes are in accordance with the MDS analysis results.Clustering of the MCL stepwise stages generally agrees with the original labeling of the samples (Supplementary Figure S4A); normal, in situ, and intermediate samples are grouped together, whereas classical stage samples are similar to the average profile of the aggressive group (Supplementary Figure S4A).Typical MCL cases exhibit distinct profiles compared to the healthy ones (Supplementary Figure S4B).Notably, batch effect correction reveals distinct clustering of patients' transcription profiles, fitting well with the five newly diagnosed patients (Supplementary Figure S3A).Module A Module B First, transcriptional data were preprocessed, then comparative analysis was performed between pairs of different sample groups.SVM models were also trained using the most variable genes of each sample group, and MCL and samples were recategorized accordingly.The Limma R package was employed and differentially expressed genes (DEGs) were selected based on an adjusted p-value threshold of 0.05.Next, DEGs were used as an input to four independent subsequent analysis: pathway enrichment (performed by two bioinformatic tools pathfindR [18] and PaintOmics [19]), drug repurposing, co-expression network spectral clustering, and compound drug combination prediction analyses.

Comparative Transcriptomic Analysis Reveals Significant MCL Stage-Dependent Deregulation of Gene Transcription
Comparative transcriptomics between sample groups of merged MCL stages vs. healthy controls, as well as stepwise MCL stage comparisons, revealed significant differences in the transcription levels of several genes (Tables 1 and 2; Figure 2A).The detected differences are both qualitative and quantitative, with a pronounced increase in the number of DEGs as MCL progresses to an aggressive stage (Figure 2A, Supplementary Table S4).Progression from normal mantle zone B cells to the in situ MCL phenotype (NtoIS; Supplementary Table S2) comprises a "precursor" set of 27 DEGs.Transition to the subsequent stages leads to a more pronounced increase in the number of DEGs, which are doubled at the final stage (1176 and 2677; IStoI and ItoA, respectively; Figure 2A).Four genes, corresponding to zinc finger protein 219, tumor necrosis factor ligand 4, cannabinoid receptor 1, and B-and T-lymphocyte attenuator, are common between NtoIS and IStoI stage transitions (Figure 2A).Different DEGs seem to be involved in the intermediate and aggressive stage progressions, indicated by their overlap (671 common DEGs; ~52%; Figure 2A).On the other hand, a substantial number of 5467 DEGs was identified when PB B cells of five typical MCL patients were compared against eight healthy ones (Figure 2A; HtoMCLx).Interestingly, the number significantly increased to 9510 DEGs when the number of patient cases was increased (BtoMCL; Figure 2A), possibly emphasizing each patient's unique contributions to cellular and molecular heterogeneity.
We constructed a pathway-pathway network of enriched KEGG pathways in all three stepwise MCL stage comparisons (NtoIS, IStoI, ItoA; Supplementary Figure S8A-C), where the thickness of the edges and the size of the nodes is proportional to the number of shared genes and the FE, respectively.NtoIS transition included eight pathways, all sharing the cyclin D gene (Supplementary Figure S8A).In the intermediate and aggressive stages, additional pathways under the scope of cyclin D were "triggered", including "AMPK", "JAK-STAT", "FoxO", "apelin", "oxotocin" and "AGE-RAGE" signaling pathways and "tight junction" (Supplementary Figure S8B,C).

Spectral Clustering Algorithm Detects Common Modules between Gene Co-Expression Networks
PE analysis employs public repositories that organize established molecular networks, and thus is limited by the current state of knowledge of biological processes involved mainly in physiological conditions.To explore novel MCL-related gene correlations, we constructed co-expression networks of different sample groups (including MCL stages) using the MRNETB algorithm.Networks were compared for common modules using the spectral clustering approach described by Zhang et al. [23,24].Corresponding sample comparisons were made (Table 1), and the number of modules was decided based on the respective eigengaps (Supplementary Figure S7, Supplementary Table S7).
Two common modules were found between the normal and in situ sample groups (Figure 3A).Worth mentioning are the genes: asRNA ATP2B1-AS1 and myosin heavy chain 11 (MYH11), both members of the "VEGFA-VEGFR2 signaling pathway" (Figure 3A).GSEA analysis of the modules identified five wiki pathways, including "TGF-beta receptor signaling" with genes SMAD4/5 (Supplementary Table S7A).Co-expression network spectral clustering.Gene co-expression networks at different MCL stages were constructed using the MRNETB algorithm [25].Modules of co-expressed genes were identified and aligned between networks using a spectral clustering approximation approach [23,24].The union of the aligned modules between (A) normal and in situ, (B) in situ and intermediate, (C) intermediate and aggressive are depicted.Up-and down-regulated genes are depicted as regular or flipped triangles, respectively.The log 2 fold change of the co-expression between MCL stages is used as a weight of the edges.Loss of co-expression is depicted with green (negative log 2 fold change), whereas gain of co-expression with red (positive log 2 fold change).As a result, "green communities" are genes where co-expression relationships are weakened as the MCL progresses to the next stage.The node border line color depicts the Wiki pathway in which a gene participates.A gene can participate in more than one pathway, but here we show only selected pathways (orange = apoptosis, cyan = cytoplasmic ribosomal proteins, magenta = MAPK signaling pathway, blue = PI3K-Akt signaling pathway, yellow = VEGFA-VEGFR2 signaling pathway, brown = cell cycle, purple = TGF-beta signaling pathway).
Comparing in situ and intermediate stages required significant parameter calibration (q threshold, beta parameter) and common modules were not easily identified.We provide two alternative different parameter configurations (Supplementary Table S7A; Figure 3B).
Six aligned modules were discovered between intermediate and aggressive co-expression networks (Figure 3C).GSEA analysis pointed out the prevalence in cytoplasmic ribosomal proteins and eleven genes of the VEGFA-VEGFR2 signaling pathway (Supplementary Table S7).

Drug Repurposing Recommends New Potential MCL Stage-Specific Treatments
Following the identification of individual genes that were differentially transcribed and the corresponding pathways that were enriched, we applied a DR approach that proposes stage-specific FDA-approved drugs (Figures 4 and 5).A drug-gene network was constructed given the drug-target relationships catalogued in DrugBank [26].We considered as new potential treatments the FDA-approved drugs with a "targeting" action upon at least one of the top DEGs.Additional criteria that consider both the type of action (inhibition, antagonist, etc.) and the mode of change (up/down) were applied (Figures 4 and 5, Supplementary Figure S5; Supplementary Methods).In total, 388 FDA-approved drugs are targeting the protein products of the top DEGs (Supplementary Table S6, Figure 4).As proof of concept of our approach, several compounds currently used in the therapy of MCL are among them (Figure 4; diamonds in cyan).In sum, 3 out of the 388 recommended drugs are currently used in MCL (ibrutinib, acalabrutinib, zanubrutinib), 13 and 8 are at clinical trials of MCL and nHL cases, respectively, 2 are administered in T-cell lymphoma patients, 4 in lymphoma in general, and 9 in leukemia (Supplementary Figure S9A).Interestingly, 59 out of the 388 drugs (~18.6%) are used for the treatment of mental illnesses: schizophrenia, psychotic disorders, depressive disorder, bipolar disorder, and ADHD (Supplementary Figure S9B).Fifty-five of these drugs target the serotonin receptor (HTR2A), and three drugs target two proteins: carbonic anhydrase CA2 and synaptic vesicular amine transporter SLC18A2.Other indications of the proposed drugs include breast, colorectal, ovarian, and bowel cancer, and other diseases, such as rheumatoid arthritis, osteoarthritis, and Parkinson's disease (Supplementary Figure S9B).Twenty-three of the molecular entities are dietary supplements, thirty have anti-inflammatory use, nine are analgesic and thirteen are antibiotic agents.Moreover, based on DrugBank's categorization, 31% of the proposed drugs are organoheterocyclic compounds and 18% are nutraceutical, whereas most of them are small molecules or organic compounds (Supplementary Figure S9C,D).
At the initiation of MCL, fewer DEGs are identified, leading to fewer potential drugs discovered (Figures 2A and 4).Arsenic trioxide, an antagonist, and encorafenib, an inhibitor of CCND1 protein, are proposed at the in-situ stage; other compounds discovered are cannabidiol, rimonabant and dronabinol for targeting the overexpressed cannabinoid receptor 1 (CNR1).Cannabidiol is a negative allosteric modulator, rimonabant a potent and selective cannabinoid CB1 receptor antagonist, and dronabinol stimulates both CB1 and CB2 cannabinoid receptors.
More potential protein targets are found at the aggressive stage of MCL, leading to a greater variety of new drugs and targeted biological pathways (Figure 4).Seventy-five of the proposed MCL drugs target HTR2A, forty prostaglandin G/H synthase 1 (PTGS1), twenty-one DNA topoisomerase 2-alpha (TOP2A), fourteen amyloid-beta precursor protein (APP), fourteen carbonic anhydrase 2 (CA2) and thirteen vascular endothelial growth factor (VEGFA).
In sum, 116 of the proposed drugs target proteins at the two final stages of MCL (Figure 4).Interestingly, among them are metabolites (oxygen, NADH, ATP, serotonin molecules), and vitamins (vitamin E and derivatives, vitamin A) or zinc used as nutrition supplements.In total, 21 metabolites and nutrients were proposed as "drugs" (Figures 4 and 5, Supplementary Table S6).
Fostamatinib, a compound of interest, is currently in clinical trials for the treatment of lymphoma (including MCL) and leukemia.Fostamatinib is a prodrug converted to its active metabolite R406 in the intestine.R406 is a tyrosine kinase inhibitor exhibiting activity against spleen tyrosine kinase SYK, a key player of the BCR pathway [27,28] (Table S1).SYK was found to be overexpressed in the 63 MCL patient samples versus normal B cells.
Additional potential targets of R406 are Bruton tyrosine kinase (BTK) and other tyrosine kinases, such as FYN, CSF1R and PTK2B [28].FDA-approved drugs proposed in the current analysis for the treatment of MCL.The potential drugs were obtained using DrugBank's [26] catalogue of FDA-approved drugs and their gene/protein targets.Drug-gene pairs were selected based on the top DEGs identified in nine sample group comparisons (Table 1) and filtered given the type of action.Genes and drugs are depicted as circles and diamonds/triangles, respectively.Two types of drugs exist in the DrugBank database [26], the small molecules (diamonds) and the biotech molecular entities (arrowheads).Drugs are colored based on their original indication or clinical trial: MCL drugs (cyan), T-cell lymphomas (orange), other lymphoma/leukemia subtypes (green), drugs at clinical trial for the treatment of MCL [28]    As proof of our approach, a considerable number of compounds are indicated in MCL (cyan), T-cell lymphoma (orange), other lymphomas (green), leukemia (gray), R/R lymphoma, or other cancer treatments.Some of them are at clinical trials for the treatment of MCL (magenta) or other subtypes of lymphoma (blue).Our approach also identifies groups of compounds originally prescribed in other diseases such as schizophrenia or bipolar disease (DD: depressive disorder; RA: rheumatoid arthritis or osteoarthritis; PD: Parkinson's disease).
Aldesleukin (non-glycosylated human interleukin 2), produced using genetically engineered E. coli strains, is documented as an agonist of the IL2RG cytokine receptor and is proposed at the intermediate stage (Figure 4; Supplementary Table S6).Aldesleukin in combination with cyclophosphamide, fludarabine or immunotherapy is currently clinically being tested in treating R/R MCL or other types of nHL (Table S1, trial NCT00924326).Denileukin diftitox is a recombinant DNA-derived immunotoxin administered against malignancies expressing the IL-2 receptor (e.g., cutaneous T-cell lymphoma).
Aflibercept, bevacizumab and minocycline are three out of thirteen compounds targeting vascular endothelial growth factor (VEGFA), which is active in angiogenesis.Binding of VEGFA to its cell-surface receptors FLT1 and VEGFR1 leads to a cascade of signaling pathways: the PI3K-Akt pathway, which leads to increased endothelial cell survival, or the MAPK pathway, which initiates DNA synthesis and cell growth.VEGFA and FLT1 are up-regulated in MCL patients when compared to normal B cells (BtoMCL; Supplementary Table S4).
Eighteen proposed drugs are approved in other types of cancer, and fourteen of them are highlighted by the comparison of MCL cases diagnosed early against advanced (MCL(e)toMCL).Catumaxomab hybrid monoclonal antibody is proposed for the intermediate stage; it binds to antigens CD3 and treats malignant ascites, a condition that develops in cancer patients.Cabazitaxel is an antineoplastic agent used for the treatment of hormone-refractory metastatic prostate cancer [29], which is proposed by the current analysis for the treatment of aggressive MCL (Figure 4).Podofilox, epirubicin and dexrazoxane are DNA topoisomerase 2-alpha (TOP2A) inhibitors and are proposed by the current analysis for treating advanced-stage MCL (BcelltoMCL).Epirubicin and dexrazoxane are cytotoxic anthracyclines.The latter is used to prevent and improve cardiomyopathy and combined with doxorubicin is used for the treatment of metastatic breast cancer [30,31].Podofilox, found in the podophyllin resin from the roots of podophyllum plants and used for treating genital warts, has potential antineoplastic properties and antiproliferation effects in gastric cancer [32].Gemcitabine, methotrexate, capecitabine are proposed in our DR analysis for inhibiting the up-regulated thymidylate synthase TYMS, which has a significant role in the de novo mitochondrial biosynthesis of thymidylate, an essential precursor for DNA biosynthesis.Methotrexate, an antifolate used in rheumatoid arthritis (RA) treatment, also has antileukemic effects, and thus it is included in nHL therapeutic regiments [33,34].Capecitabine is a non-cytotoxic fluoropyrimidine carbamate that is taken orally; it has a synergistic effect when combined with docetaxel in colon cancer [35]; it is also used to treat locally advanced and secondary breast cancer [36].Gemcitabine is a nucleoside analogue that works by replacing the building blocks of nucleic acids during DNA elongation, leading to cell growth arrest; it has a wide range of antitumor activity as monotherapy or in combination with other anticancer agents administered in NSCLC, ovarian, metastatic breast, and pancreatic cancer [37].Eribulin, a synthetic analogue of the marine sponge natural product halichondrin B that acts as microtubule inhibitor [38], is selected for targeting tubulin beta-1 chain (TUBB1).Eribulin is used to treat metastatic breast cancer [39] and metastatic or unresectable liposarcoma.Vandetanib is an antineoplastic kinase inhibitor used to treat symptomatic or progressive medullary thyroid cancer in patients with unresectable locally advanced or metastatic disease.Vandetanib is proposed by current DR analysis as an inhibitor of the growth factor VEGFA, which is overexpressed in late MCL stages (III/IV) compared to newly diagnosed MCL cases (MCL(e)toMCL).Another drug proposed for the MCL(e)toMCL transition is regorafenib, an inhibitor of multiple kinases administered in metastatic colorectal cancer patients that have previously received anti-VEGF therapy [40], metastatic gastrointestinal stromal tumor that have been treated with imatinib [41] and hepatocellular carcinoma.
Venetoclax was also in our list of known drug combinations for MCL (Supplementary Table S1) and is used together with ibrutinib [42].In our prediction of drug combinations of known drugs as a validation of the method (Figure S10), we could not predict effective combinations for venetoclax due to the lack of sufficient connections in the PPIN.Furthermore, venetoclax is included in DrugBank and the BCL2 gene (P10415; Ubiquitin protein ligase binding) is the only listed target of venetoclax, which is known to suppress apoptosis in factor-dependent lymphohematopoietic and neural cells.The BCL2 gene is one of the up-regulated genes listed in (Supplementary Table S4; gene 118) with p-values ~10 −8 .However, BCL2 was not among the top DEGs, since it was found only in two comparisons: BtoMCL and BtoMCLe.Identifying gene signatures related to apoptosis confirms that our data include relevant markers, strengthening the link between our findings and therapeutic methods.

Exploring Drug Combinations in MCL
We wanted to explore known compound combinations by collecting antitumor regiments and effective multidrug therapies in MCL or other types of lymphoma (Supplementary Table S1B, Supplementary Figure S10B).We ended up with 84 compounds; 44 had more than two protein targets in the human PPIN and were further analyzed.We screened all nine sample comparisons regarding the therapeutics already applied for MCL.Two, eight, five, fifteen, seventeen and twenty out of twenty-eight known drug pairs are predicted to be effective in NtoIS, IStoI, ItoA, BtoMCL(e), BtoMCL and MCL(e)toMCL comparisons, respectively (Supplementary Figure S10B).
A network principal category was assigned to every pair among the 44 known MCL drugs.For BtoMCL and MCLetoMCL sample comparisons, the results are summarized as dual heatmaps in which the compounds known to be effective are highlighted (Supplementary Figure S10B).Vincristine was found to be incompatible with the other agents in the BtoMCL drug screening.These include dexamethasone, etoposide and doxorubicin added to regiments such as R-MACLO-IVAM and R-Hyper-CVAD.On the other hand, in the MCLetoMCL comparison, bortezomib was estimated to be ineffective with vincristine, doxorubicin and dexamethasone, as expected by applied strategies such as VcR-CVAD and VR-CAP cocktails.
As a next step, we investigated the pairwise efficacy of the proposed drugs resulting from the DR step at different MCL stages and sample groups.We formed nine disease modules, one per sample comparison (Table 2), each consisting of the respective top 100 DEGs; the NtoIS disease module consisted of 27 genes.We repeated the in-silico drug efficacy screening for every disease module and the respective subset of proposed drugs (Figures 6, 7 and S11).We summarize the proposed dual-drug therapies in three drug-drug networks, consisting of 9, 203 and 120 effective pairs during in situ, intermediate and aggressive MCL stages, respectively (Figure 6).Target-target networks derived from the respective drug combinations were constructed, indicating the proteins/genes to be targeted simultaneously in the potentially successful therapeutic schema.Targeting nonoverlapping gene neighborhoods is expected to be potent in treating human diseases.In the current analysis, "cannabinoid receptor 1", "suppressor of tumorigenicity 14 protein", and "urokinase plasminogen activator surface receptor" are not in the vicinity of "CCND1" or "cyclin-dependent kinase inhibitor 1", and this makes them suitable for dual-drug approaches in the in-situ stage.Dronabinol, a synthetic delta-9-THC alleviates nausea caused by cancer chemotherapy, can be combined with encorafenib or arsenic trioxide (Figure 6; NtoIS).In the aggressive stage, inhibition of HTR2A together with the enhancement of tyrosine-protein kinases BTK, BLK, LYN or FYN is expected to be beneficial.In intermediate or aggressive stages, zanubrutinib, dasatinib, fostamatinib, and aldesleukin combined with antipsychotic, antidepressants, or Parkinson's disease drugs were estimated to have synergistic value.
Finally, we screened the efficacy of 274 out of the 388 compounds, with at least two targets in the PPIN, in patients with early MCL.We constructed a network of effective target pairs, which arose from the respective drug pairs that are in the vicinity of the disease module but targeting different genes (Figure 7D).As suggested by the target-target network, the cannabinoid receptor CNR1 should be targeted alongside genes such as "CCND1", "cyclin-dependent kinase 1", "alpha-actinin-1", "apolipoprotein E", "apolipoprotein B receptor", and "tyrosine-protein kinases" BTK, BLK, SYK, FYN, LYN (Figure 7D).

Discussion
Studying cellular behavior through the prism of gene expression profiling is currently considered a standard procedure, with a wealth of data being deposited in public databases available for reanalysis and meta-analysis.However, exploitation of such data can be challenging due to the different data formats, as well as the heterogeneity of samples and conditions.We analyzed a selected number of datasets using the criterion of the availability of samples taken from healthy individuals, in addition to the well-characterized disease conditions and stages.Large heterogeneous datasets are a prerequisite for the successful investigation of cancer heterogeneity [43].High-quality Affymetrix datasets of MCL patients at early or progressed stage were included in our pipeline analysis to effectively represent sample diversity in precision medicine.Additional conditions can be analyzed in the future, such as cases harboring mutations other than CCND1 or cases where the disease has not yet developed despite the existence of a chromosomal translocation.Proper sample selection, collection, storage, and preparation is of paramount importance to avoid systematic errors and artificial results that could lead to wrong conclusions and future decisions.Additional bioinformatic tools for data processing, correcting, and integrating should also be employed in the future.
In this study, we described a modular pipeline for the analysis of transcriptomic data that combines different network-based algorithms for exploring differentially expressed genes, enriched pathways, functional modules, and potential drug therapies for MCL at distinct stages.Individual units of the pipeline explore underlying molecular information but from different points of view; the results either led to overlapping conclusions, reinforcing each other, or contributed with uniquely extracted information.Furthermore, by including different implementations of the same type of analyses, we became aware of the limitations of each method.One example is PaintOmics [19], which works with ensemble transcript IDs instead of gene symbols in the case of pathfindR [18] and does not rely on PPIN communities, and thus it accounts for transcripts and not only the protein-coding transcripts.
Taken together, the multifaceted bioinformatic pipeline and the variety of the biological samples (i.e., stepwise morphological stages of MCL or blood samples of patients at distinct stages) brought to light features of a relative abundance of transcripts that relate to the genetic complexity and aggravation of MCL.The potential role of SMAD2/3:SMAD4 and RUNX1 transcriptional regulation and the SUMOylation of transcription cofactors and DNA-related proteins during the first phase of MCL progression (Supplementary Table S5D) is pointed out by two distinct levels of analysis: the pathway enrichment analysis and the co-expression functional module alignment.Progression from early-to late-stage MCL is characterized by SPARC and VEGFA overexpression, identified by the comparative analysis and by the pathway enrichment analysis as members of the "miR-509-3p alteration of YAP1/E" and the "VEGF signaling" pathways, respectively.VEGFA's role in late-stage MCL is corroborated by the co-expression functional module analysis (Supplementary Figure S8A).Protein-coding genes have undoubtedly the lion's share in drug target discovery; however, the significance of non-coding transcripts as epigenetic modulators and switches in MCL is evident by our analysis (Supplementary Discussion).
Our approach has identified stage-specific MCL biological pathways, many of which have been previously reported in the literature to be implicated in cancer development and progression [10,11,[44][45][46].The open question is how these pathways are rewired and re-connected in the different cancer cells and cancer types and what the dynamics of this rewiring are.In early MCL, p53, Wnt, Hippo, Hedgehog and prolactin signaling pathways, together with miRNA activity, should be further examined (Supplementary Discussion).In the literature, SPARC overexpression has been correlated with six miRNAs involved in medulloblastoma progression [47], whereas the reciprocal modulation between VEGFA and SPARC proteins has been related to tumor growth [45].The exact role of these proteins in MCL pathogenesis as well as their use as biomarkers remains to be discovered (Supplementary Discussion).
PE analysis profoundly depends on the established knowledge regarding the molecular circuits in a cell and how this information is recorded in public databases.In rare and not-well-studied diseases, unknown gene networks may be present, and thus it is necessary to employ unbiased computational approaches that explore novel gene relationships characteristic of that disease.Here, we apply functional co-expression module alignment, and we present unique gene networks, characteristic of each MCL stage.
In cancer, the complexity and variability between patients is high, with multiple components participating, and thus combination therapies have higher response rates and can reduce development of drug resistance and relapse [48].Contemporary MCL therapeutic strategies usually include multiple agents, such as the first-line cytarabinecontaining induction chemotherapies (hyper-CVAD and VcR-CVAD regiments; Supplementary Table S1B) [42].Rarely is monotherapy effective in MCL, such as lenalidomide, an immunomodulator used in R/R MCL [17] (Supplementary Table S1A).
Despite its advantages, multidrug therapy may worsen certain side effects, i.e., toxicity or severe infectious complications (e.g., combinations of BCR kinase inhibitors with chemoimmunotherapy regimens [49]).This underscores the necessity of predicting potentially effective drug combinations and how a drug may interfere with or modulate the action of another drug.
We summarize the potential therapeutic agents per MCL stage, and we propose effective pairwise combinations for further experimental validation (Figures 5 and 6).In cancer, early diagnosis and treatment usually improve the chances of survival and reduce complications, drug resistance and relapse.However, limited treatment options were found for in situ MCL (e.g., encorafenib), whereas more treatment options were proposed for intermediate and aggressive MCL (e.g., aldesleukin, fostamatinib, vincristine, dasatinib).
Antipsychotic drugs, mainly targeting upregulated HTR2A, occupy a large part of the proposed drugs.It has been found that patients treated for schizophrenia have lower incidence of certain types of cancer (respiratory, prostate, bladder cancers [50]) and that serotonin receptor signaling may modulate anti-tumor immunity [51].As demonstrated, serotonin receptor mRNAs and proteins are expressed across diverse cancer types, and pharmacological inhibition of 5-HT receptors leads to activation of the p53 DNA damage pathway and suppression of MAPK activity [52].
Synergistic enhancement of cancer therapy uses combinations of anti-tumor drugs to reduce toxicity and relapse cases; regiments such as R-hyper-CVAD are standard treatments for MCL (Supplementary Table S1B).We examined efficient drug pairs at different MCL stages by implementing a drug-disease network distance approach [17] and selecting drug pairs assigned to the "complementary exposure" network configuration (Supplementary Figure S10A).We were able to draw conclusions regarding proposed target gene combinations (Figures 6 and 7D), e.g., serotonin or cannabinoid receptors with tyrosine kinases LYN, SYN, FYN, BTK, BLK.These findings supported by our in-silico analysis should be further validated and studied for their clinical impact.
In conclusion, our modular pipeline allowed for the comprehensive exploration of MCL transcriptomic data, which led to a more complete set of potential therapeutic targets.We were able to find individual DEGs in healthy vs. disease conditions, their functional modules and MCL pathogenesis-involved pathways at distinct stages.Moreover, our pipeline examined potential drug therapies, detecting drugs already used for MCL (e.g., lenalidomide) or other types of lymphoma (e.g., pomalidomide, belinostat) as a proof of concept, as well as newly proposed ones.Fostamatinib is proposed for all three MCL stage transitions (Figures 4 and 5) via protein targets, such as SYK, BTK, MAP2K2 or PTK2B, that are affected indirectly.Our work highlights the benefit that can be leveraged by such an approach for identifying high-priority novel, efficacious combination therapies against MCL that should be further validated for their clinical impact.
Despite the advantages of our proposed pipeline, we should emphasize the need for experimental validation of the findings using a combination of approaches (e.g., in vitro and in vivo) and current limitations: (a) prediction of effective drug pairs is limited by the current knowledge of drug-protein and protein-protein interactions that define the structure and density of the PPIN used; (b) gene co-expression networks do not directly imply causal relationships (i.e., cause-and-effect relationships of gene expression); hence, with this method, we are not able to draw conclusions on the origin of the malignancy or design targeted therapies; (c) the proposed pipeline necessitates the preexistence of a batch effect-removal model trained specifically on the platforms used (currently available only for the Affymetrix U133A).From our experience, the removal of batch effect is critical for the correct DEGs to "shine" and removes irrelevant distances produced by tissue type, sex and other aberrations.

Methods
We applied network-based pathway enrichment algorithms [18,19] and inference methods [53] to independent gene expression data [10,12].We integrated and reexamined four high-fidelity RNA array experiments available in public databases (ArrayExpress, Gene Expression Omnibus-GEO) including healthy donor samples (Figure 1).The steps of the computation framework were: (1) identification of DEGs, (2) characterization of the significantly affected molecular pathways using pathway enrichment analysis [18,19], (3) construction of gene co-expression networks for the different MCL stages, and 4) exploration of new treatments for MCL stages using available drug repositories (Figure 1).
Except for the in-situ stage, the three final stages were also compared with the normal state (NtoIS, NtoI, NtoA; Table 1).

Co-Expression Network Construction
We applied mutual information network inference methods to construct co-expression networks of different sample groups.The MRNETB (Maximum Relevance Minimum Redundancy Backward) network inference algorithm of the minet R/Bioconductor package [56] was employed, where the mutual information is an estimation based on k nearest neighbors, as proposed by Krakov et al. [57] and implemented in the parmigene R package.The co-expression networks of the sample groups were constructed, starting with the batch effect corrected and quantile-normalized transcriptional profiles.First, the mutual information between every pair of transcripts was calculated using entropy estimates from K-nearest-neighbor distances [57].The estimated mutual information values were not normalized via Fisher's z transformation as described in Zhang et al.; they were used as an input for the MRNETB algorithm to infer the networks using the maximum relevance/minimum redundancy criterion [56].Hard thresholding was applied to transform the weighted networks to unweighted by taking a quantile as a threshold, below which the values were set to zero.The quantile thresholds were different for each co-expression network, ranging from 93% to 98%.Therefore, co-expression networks were represented by their adjacency matrices A k where A k (i, j) = 1 when there is an edge between genes i and j.Prior to module identification, the vertices, which were unconnected in both networks, were removed.

Identifying Functional Modules in Co-Expression Networks
The co-expression networks were aligned in terms of functional modules using the spectral clustering approach of Zhang et al. [24].Spectral clustering is a graph-based, unsupervised learning method that incorporates other known clustering algorithms, usually k-means.It performs with the same time module identification and alignment between different networks (here co-expression networks).Transcription values remain unlabeled for the clustering algorithm relative to the co-expression network they originate from and are grouped to a given number of clusters (k).This spectral clustering approach integrates adjacency matrices of more than two networks, and hence it can compare more than two networks.Each gene is represented as a node as many times as the number of networks compared (here, we focused only on pairs of co-expression networks).A cluster in the first network and its counterpart cluster in the second network may include different genes, but the final modules are constructed by the union of those genes.The number of modules, i.e., the number of clusters, is decided based on the eigenvalue gap (Supplementary Figure S7).We confined the size of the aligned modules from 8 to 800 nodes (Supplementary Figure S5), and we excluded the unconnected subnetworks from the final modules.As a means of representing the changes between the two networks, we used the fold of change of the co-expression value that was assigned to the edges (Figure 3, Supplementary Figure S8).Co-expression modules were visualized and merged using Cytoscape software (v3.9.1) [58].Finally, we sought to find the enriched terms in the identified modules, and so we performed GSEA analyses using the clusterProfiler R package.We searched for enriched Wiki and KEGG pathways, but also Gene Ontology terms (biological process and molecular function) (Supplementary Table S7A).

Drug Repurposing
We explored new potential treatments for each MCL stage using the drug-target information stored in DrugBank database [26].Totally, 15,163 drug-gene interactions of approved drugs were available, of which 8680 were of type "target" (DrugBank organizes drug-gene interactions into four main types: target, carrier, transporter, and enzyme).In 3199 of the drug-target interactions, the action of the drug upon the target was characterized (e.g., inhibitor, partial agonist, substrate, binder).The drug actions are accessible through the full xml DrugBank database (32 types).To define the proposed drugs, we first selected the top 100 DEGs of each sample group comparison (Supplementary Figure S5).Then, drugtarget edges were filtered out in case the type of action was the opposite of the preferred one, as dictated by the fold-change direction (Supplementary Figure S5f; Supplementary Methods).In cases of significantly up-regulated genes, targeting drugs that enhance their activity (i.e., activators, agonists, partial agonists) were excluded from the list.Contrarily, if a gene was down-regulated, then "inhibitors" and "antagonists" were not considered as potential treatments.

Assessing Drug Combination Efficacy
Combination therapies has been proven more effective than monotherapy regimens.They allow lower doses and reduce the risk of or adverse effect in patients [42,59].We sought to explore synergistic drug pairs in MCL by implementing a recently proposed network-based stratification method [17] that was tested on antihypertensive drug combinations.This method is based on network proximity metrics that reflect the topological overlap between drug modules and disease modules within the human protein-protein interaction network (PPIN).
The in-silico screening of drug-pair efficacy described by Cheng et al. defines two network proximity measures, the separation (s) and the z-score, to estimate the closeness of drugs and of drug and disease modules, respectively.They define the disease module as the set of genes that are related to a certain disease.These genes are usually found in proximity to the human PPIN.Both separation (s) and the z-score (z) distance measures rely on the calculation of shortest paths between pairs of proteins within the human PPIN (Supplementary Methods).The value of the separation (s) measure reflects the proximity of two drugs: if negative, the two drugs have overlapping targets; otherwise, they have distinct targets.In the same manner, when the value of the z-score is negative, then the drug targets proteins of the disease module.After calculating the separation values between all pairs of drugs of interest and the respective z-scores between drugs and the disease module, the method decides for each pair one of the six network principles based on the network proximity between the drug's targets and the disease module (Supplementary Figure S10A).The network principle is decided based on three values: the separation (s AB ) of drugs A and B and the z-scores z A and z B between drugs A, B, and the disease module (Supplementary Figure S10A).As concluded by Cheng et al. [17], the most effective drug combinations fall under the second principle, the "complementary exposure", where both drugs overlap with the disease module, but not with each other (Figure S10A).
For the analysis, we used the human PPIN network provided by Cheng et al. [17], and we employed an in-house implementation of the Dijkstra algorithm to calculate the shortest paths between nodes in the network.We defined four disease modules, each consisting of the top 200 DEGs of the between-stage comparisons (NtoIS, IStoI, ItoA).Of the 388 proposed drugs, we discarded those with fewer than two targets present in the human PPIN for reasons of statistical validity (275 remained).Multidrug therapeutic strategies currently used in the treatment of MCL were collected from the Drug Combination Database (DCDB) [60] and from the literature (Supplementary Table S1B).

Figure 1 .
Figure 1.Computational analysis pipeline.Transcriptional data obtained measuring RNA levels of tissue or blood samples of MCL patients and normal cases following a modular pipeline of network-based methods.Five microarray studies were revisited.Kimura et al. (GSE30189) isolated tissue samples of 17 CCND1-positive MCL patients at four stepwise distinct stages (in situ, classical, intermediate and aggressive) and four normal mantle zone B lymphocytes samples.Espinet et al. (GSE45717) collected PB B cells of five typical MCL patients and eight healthy individuals and obtained transcriptional data using Affymetrix Human Exon 1.0 ST arrays.Three Affymetrix Human U133 Plus 2 datasets were also analyzed.Leshchenko et al. (GSE19243) purified CD19+ fractions from peripheral blood of newly diagnosed MCL patients (five selected).Hartmann et al. extracted RNA

Figure 3 Figure 3 .
Figure 3Figure 3.Co-expression network spectral clustering.Gene co-expression networks at different MCL stages were constructed using the MRNETB algorithm[25].Modules of co-expressed genes were identified and aligned between networks using a spectral clustering approximation approach[23,24].The union of the aligned modules between (A) normal and in situ, (B) in situ and intermediate, (C) intermediate and aggressive are depicted.Up-and down-regulated genes are depicted as regular or flipped triangles, respectively.The log 2 fold change of the co-expression between MCL stages is used as a weight of the edges.Loss of co-expression is depicted with green (negative log 2 fold change), whereas gain of co-expression with red (positive log 2 fold change).As a result, "green communities" are genes where co-expression relationships are weakened as the MCL progresses to the next stage.The node border line color depicts the Wiki pathway in which a gene participates.A gene can participate in more than one pathway, but here we show only selected pathways (orange = apoptosis, cyan = cytoplasmic ribosomal proteins, magenta = MAPK signaling pathway, blue = PI3K-Akt signaling pathway, yellow = VEGFA-VEGFR2 signaling pathway, brown = cell cycle, purple = TGF-beta signaling pathway).

Figure 4 .
Figure 4. Proposed drugs for the distinct stages of MCL.Bipartite drug-gene graph of the 388 FDA-approved drugs proposed in the current analysis for the treatment of MCL.The potential drugs were obtained using DrugBank's[26] catalogue of FDA-approved drugs and their gene/protein targets.Drug-gene pairs were selected based on the top DEGs identified in nine sample group comparisons (Table1) and filtered given the type of action.Genes and drugs are depicted as circles and diamonds/triangles, respectively.Two types of drugs exist in the DrugBank database[26], the small molecules (diamonds) and the biotech molecular entities (arrowheads).Drugs are colored based on their original indication or clinical trial: MCL drugs (cyan), T-cell lymphomas (orange), other lymphoma/leukemia subtypes (green), drugs at clinical trial for the treatment of MCL[28] (yellow, drugs indicated in other types of cancer (brown) and nutraceutical substances (white).Each gene circle is a donut chart, where each donut slice represents one of the nine sample group comparisons where the gene was found to be significantly deregulated.Donut slices (i.e., sample comparisons) have the following color code: comparisons of the three MCL stages to the normal case (NtoIS, NtoI, NtoA) are gray-shaded with darker shades corresponding to more progressed MCL stage; light and dark teal represent IStoI and ItoA transitions; light and dark purple correspond to BtoMCL and BtoMCL(e) comparisons, whereas magenta corresponds to MCL(e)toMCL and blue to HtoMCLx.Finally, drug-gene relationships are characterized by the type of drug action (inhibition, agonist, antagonist etc.), and are represented either with different target arrow shapes or line types.

Figure 5 .
Figure 5. Synopsis of stage-specific pathways and the proposed MCL treatments, summarizing biological pathways that participate in MCL progression along with proposed treatments, from benign lymphadenitis to in situ, intermediate, and finally aggressive stage.Proposed drug treatments are colored or grouped based on their previous indication(s) or their clinical trial state regarding MCL or lymphoma treatment ([C]: at clinical trial).As proof of our approach, a considerable number of compounds are indicated in MCL (cyan), T-cell lymphoma (orange), other lymphomas (green), leukemia (gray), R/R lymphoma, or other cancer treatments.Some of them are at clinical trials for the treatment of MCL (magenta) or other subtypes of lymphoma (blue).Our approach also identifies groups of compounds originally prescribed in other diseases such as schizophrenia or bipolar disease (DD: depressive disorder; RA: rheumatoid arthritis or osteoarthritis; PD: Parkinson's disease).

Figure 7 .
Figure 7. Prediction of effective compound combinations in early and progressed MCL patients.Dual heatmaps present both the network proximity s of drug-drug pairs (lower half) and the network principle of the drug pair given a disease module (upper half).Three drug combination analyses are summarized here, each differing in the selected disease module and the list of proposed drugs assessed: (A) BtoMCL (B) BtoMCL(e) and (C) MCL(e)toMCL.In each analysis, the disease module is defined as the set of top 100 DEGs identified in the respective sample comparison, while only the pairs of the respective proposed drugs are assessed (51, 23, and 66 drugs respectively).Hierarchical clustering of drugs is based on the separation (s) values and clusters are colored in gray

Author
Contributions: K.P., M.A. and G.O. substantially contributed to the conception and design of the study; G.O. performed the bioinformatic analysis.All authors have read and agreed to the published version of the manuscript.Funding: This research was funded by the project Advanced Research Activities in Biomedical and Agro-alimentary Technologies (MIS 5002469) which is implemented under the Action for the Strategic Development on the Research and Technological Sector, funded by the operational program Competitiveness, Entrepreneurship and Innovation (NSRF 2014-2020) and co-financed by Greece and the European Union (European Regional Development Fund).Konstantina Psatha was funded by a

Table 1 .
Datasets used in gene co-expression network analysis.

Table 2 .
Number of enriched pathways per MCL stage transition and MCL vs. healthy white blood cells.